Asymptotic optimality of a cross-validatory predictive approach to linear model selection
In this article we study the asymptotic predictive optimality of a model
selection criterion based on the cross-validatory predictive density, already
available in the literature. For a dependent variable and associated
explanatory variables, we consider a class of linear models as approximations
to the true regression function. One selects a model among these using the
criterion under study and predicts a future replicate of the dependent variable
by an optimal predictor under the chosen model. We show that for squared error
prediction loss, this scheme of prediction performs asymptotically as well as
an oracle, where the oracle here refers to a model selection rule which
minimizes this loss if the true regression were known.

Comment: Published at http://dx.doi.org/10.1214/074921708000000110 in the IMS Collections (http://www.imstat.org/publications/imscollections.htm) by the Institute of Mathematical Statistics (http://www.imstat.org).
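The paper's criterion is based on the cross-validatory predictive density, which is not reproduced here. As a rough illustration of the flavor of cross-validatory model selection under squared error prediction loss, the following sketch selects among candidate linear models by leave-one-out least squares; the function names and the polynomial candidate models are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def loo_cv_score(X, y):
    """Leave-one-out CV score (mean squared prediction error) for a linear model.

    Uses the hat-matrix shortcut: the LOO residual equals e_i / (1 - h_ii),
    where e_i is the ordinary residual and h_ii the leverage.
    """
    H = X @ np.linalg.pinv(X)          # hat matrix X (X'X)^{-1} X'
    resid = y - H @ y
    h = np.diag(H)
    return np.mean((resid / (1.0 - h)) ** 2)

def select_model(candidates, y):
    """Pick the candidate design matrix with the smallest LOO-CV score."""
    scores = [loo_cv_score(X, y) for X in candidates]
    return int(np.argmin(scores))

# toy usage: nested polynomial approximations to an unknown regression function
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 60)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(60)
candidates = [np.vander(x, d + 1, increasing=True) for d in range(1, 8)]
best = select_model(candidates, y)
```

A future replicate would then be predicted by the least squares fit under the selected model, mirroring the select-then-predict scheme the abstract describes.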
Asymptotic Properties of Bayes Risk of a General Class of Shrinkage Priors in Multiple Hypothesis Testing Under Sparsity
Consider the problem of simultaneous testing for the means of independent
normal observations. In this paper, we study some asymptotic optimality
properties of certain multiple testing rules induced by a general class of
one-group shrinkage priors in a Bayesian decision theoretic framework, where
the overall loss is taken as the number of misclassified hypotheses. We assume
a two-groups normal mixture model for the data and consider the asymptotic
framework adopted in Bogdan et al. (2011) who introduced the notion of
asymptotic Bayes optimality under sparsity in the context of multiple testing.
The general class of one-group priors under study is rich enough to include,
among others, the families of three parameter beta, generalized double Pareto
priors, and in particular the horseshoe, the normal-exponential-gamma and the
Strawderman-Berger priors. We establish that within our chosen asymptotic
framework, the multiple testing rules under study asymptotically attain the
risk of the Bayes Oracle up to a multiplicative factor, with the constant in
the risk close to the constant in the Oracle risk. This is similar to a result
obtained in Datta and Ghosh (2013) for the multiple testing rule based on the
horseshoe estimator introduced in Carvalho et al. (2009, 2010). We further show
that under very mild assumption on the underlying sparsity parameter, the
induced decision rules based on an empirical Bayes estimate of the
corresponding global shrinkage parameter proposed by van der Pas et al. (2014),
attain the optimal Bayes risk up to the same multiplicative factor
asymptotically. We provide a unifying argument applicable for the general class
of priors under study. In the process, we settle a conjecture regarding
optimality property of the generalized double Pareto priors made in Datta and
Ghosh (2013). Our work also shows that the result in Datta and Ghosh (2013) can
be improved further.
Posterior contraction rates for one-group global-local shrinkage priors in the sparse normal means problem
We consider a high-dimensional sparse normal means model where the goal is to
estimate the mean vector assuming the proportion of non-zero means is unknown.
Using a Bayesian setting, we model the mean vector by a one-group global-local
shrinkage prior belonging to a broad class of such priors that includes the
horseshoe prior. We address some questions related to asymptotic properties of
the resulting posterior distribution of the mean vector for this class of
priors. Since the global shrinkage parameter plays a pivotal role in capturing
the sparsity in the model, we consider two ways to model this parameter in this
paper. Firstly, we consider this as an unknown fixed parameter and estimate it
by an empirical Bayes estimate. In the second approach, we do a hierarchical
Bayes treatment by assigning a suitable non-degenerate prior distribution to
it. We first show that for the class of priors under study, the posterior
distribution of the mean vector contracts around the true parameter at a near
minimax rate when the empirical Bayes approach is used. Next, we prove that in
the hierarchical Bayes approach, the corresponding Bayes estimate attains the
minimax risk asymptotically under the squared error loss function. We also show
that the posterior contracts around the true parameter at a near minimax rate.
These results generalize those of van der Pas et al. (2014, 2017), proved for the
horseshoe prior. We also study the asymptotic optimality of the horseshoe+ prior
in this context. For the horseshoe+ prior, we prove that using the empirical
Bayes estimate of the global parameter, the corresponding Bayes estimate attains
the near-minimax risk asymptotically under the squared error loss function, and
we also show that the posterior distribution contracts around the true parameter
at a near minimax rate.
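The empirical Bayes estimate of the global shrinkage parameter attributed to van der Pas et al. (2014) is, in one commonly cited form, the proportion of observations exceeding the universal threshold sqrt(2 log n), truncated to lie in [1/n, 1]. A sketch under that assumption:

```python
import numpy as np

def empirical_bayes_tau(x):
    """Empirical Bayes estimate of the global shrinkage parameter tau:
    the fraction of observations above the universal threshold
    sqrt(2 log n), clipped to the interval [1/n, 1]."""
    n = len(x)
    thresh = np.sqrt(2.0 * np.log(n))
    prop = np.mean(np.abs(x) > thresh)
    return float(np.clip(prop, 1.0 / n, 1.0))

# toy usage: 50 signals of size 10 among 1000 observations
rng = np.random.default_rng(1)
x = rng.standard_normal(1000)
x[:50] += 10.0
tau_hat = empirical_bayes_tau(x)   # close to the true sparsity level 0.05
```

Plugging tau_hat into the prior in place of the unknown global parameter gives the empirical Bayes procedure whose contraction properties the abstract discusses; the hierarchical Bayes alternative instead places a non-degenerate prior on tau.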
Asymptotic Bayes-optimality under sparsity of some multiple testing procedures
Within a Bayesian decision theoretic framework we investigate some asymptotic
optimality properties of a large class of multiple testing rules. A parametric
setup is considered, in which observations come from a normal scale mixture
model and the total loss is assumed to be the sum of losses for individual
tests. Our model can be used for testing point null hypotheses, as well as to
distinguish large signals from a multitude of very small effects. A rule is
defined to be asymptotically Bayes optimal under sparsity (ABOS), if within our
chosen asymptotic framework the ratio of its Bayes risk and that of the Bayes
oracle (a rule which minimizes the Bayes risk) converges to one. Our main
interest is in the asymptotic scheme where the proportion p of "true"
alternatives converges to zero. We fully characterize the class of fixed
threshold multiple testing rules which are ABOS, and hence derive conditions
for the asymptotic optimality of rules controlling the Bayesian False Discovery
Rate (BFDR). We finally provide conditions under which the popular
Benjamini-Hochberg (BH) and Bonferroni procedures are ABOS and show that for a
wide class of sparsity levels, the threshold of the former can be approximated
by a nonrandom threshold.

Comment: Published at http://dx.doi.org/10.1214/10-AOS869 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
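The Benjamini-Hochberg step-up procedure analyzed above is standard; a compact sketch:

```python
import numpy as np

def benjamini_hochberg(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up procedure: reject the hypotheses with the
    k smallest p-values, where k is the largest index such that
    p_(k) <= alpha * k / m. Returns a boolean rejection mask in the
    original order of the input."""
    pvals = np.asarray(pvals)
    m = len(pvals)
    order = np.argsort(pvals)
    sorted_p = pvals[order]
    below = sorted_p <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = int(np.max(np.nonzero(below)[0]))   # largest index passing the threshold
        reject[order[: k + 1]] = True
    return reject

rej = benjamini_hochberg([0.041, 0.001, 0.039, 0.008, 0.042, 0.06, 0.074, 0.205])
```

The random data-dependent threshold is sorted_p[k]; the result quoted in the abstract says that under sparsity this threshold can be approximated by a nonrandom one.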
On Density, Threshold and Emptiness Queries for Intervals in the Streaming Model
In this paper, we study the maximum density, threshold and emptiness
queries for intervals in the streaming model. The input is a stream S
of n points in the real line R and a floating closed interval W of width alpha. The specific problems we consider in this paper are as follows.
- Maximum density: find a placement of W in R containing the
maximum number of points of S.
- Threshold query: find a placement of W in R, if it exists,
that contains at least Delta elements of S.
- Emptiness query: find, if possible, a placement of W within the extent of S so that the interior of W does not contain any element of S.
The stream S, being huge, does not fit into main memory and can be read sequentially at most a constant number of times, usually once.
The problems studied here in the geometric setting are related to frequency estimation and heavy-hitter identification in a data stream. We provide lower bounds and results on the trade-off between extra space and quality of solution. We also discuss generalizations to higher-dimensional variants in a few cases.
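As a point of reference for the maximum density query, the exact offline solution, which assumes all of S fits in memory and is therefore not the streaming algorithm of the paper, is a two-pointer sweep over the sorted points:

```python
def max_density_placement(points, alpha):
    """Exact offline maximum-density query: slide a closed window of width
    alpha over the sorted points with two pointers and return
    (best_left_endpoint, max_count). Runs in O(n log n) time and O(n)
    space; the streaming versions trade extra space for solution quality."""
    pts = sorted(points)
    best_count, best_left, j = 0, None, 0
    for i, p in enumerate(pts):
        # window [p, p + alpha], left endpoint anchored at pts[i]
        while j < len(pts) and pts[j] <= p + alpha:
            j += 1
        if j - i > best_count:
            best_count, best_left = j - i, p
    return best_left, best_count

# window of width 0.3 covering the four points clustered near 1.6
left, cnt = max_density_placement([0.0, 0.1, 0.2, 1.5, 1.6, 1.65, 1.7, 3.0], 0.3)
```

It suffices to anchor the window's left endpoint at input points, since any optimal placement can be shifted left until it touches one. Threshold and emptiness queries reduce to comparing the counts produced by the same sweep against Delta or zero.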
Model selection for high dimensional problems with application to function estimation
The problem of selecting a model in an infinite- or high-dimensional setup has been of great interest in recent years. High-dimensional problems typically arise when the number of possible parameters increases with the sample size, while infinite-dimensional problems are usually nonparametric in nature. In this thesis we consider two such settings, study the behavior of well-known model selection criteria, and propose new criteria with optimal properties.

Using a complete orthonormal basis of L2, the unknown drift function in the white-noise model can be represented as an infinite linear combination of the basis functions, the coefficients being the (unknown) Fourier coefficients; the problem thus reduces to estimating the vector of Fourier coefficients. It is shown that model selection by the Akaike Information Criterion (AIC), or suitable variants of it (where under each model all but the first finitely many Fourier coefficients are assumed to be zero), followed by least squares estimation, achieves the asymptotic minimax rate of convergence (over an appropriate subset of the parameter space) for squared error loss. A Bayesian model selection rule followed by Bayes estimates is also shown to achieve the same rate of convergence asymptotically. A simulation study is then carried out to apply these rules and some other standard techniques to the closely related nonparametric regression problem, and the performances of the different estimation procedures are compared.

It is known that BIC may be an inappropriate model selection criterion and a poor approximation to integrated likelihoods in some high-dimensional problems. We propose a generalization of BIC, called GBIC, which approximates the logarithm of the integrated likelihood up to O(1), and a Laplace approximation to the integrated likelihood correct up to o(1), in a high-dimensional setup when the observations come from an exponential family of distributions.
Rates of convergence of the Laplace approximation are derived for specific examples. Extensive simulation results show that GBIC performs much better than BIC, and that the Laplace approximation performs remarkably well in many examples, including some outside the exponential family.
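In the sequence-model formulation described above, an AIC/Mallows-Cp-type truncation rule keeps the first k empirical Fourier coefficients, choosing k to minimize sum_{j>k} z_j^2 + 2k/n. This is a standard variant and may differ in detail from the thesis's exact criteria; a sketch:

```python
import numpy as np

def aic_truncation(z, n):
    """Select a truncation level in the Gaussian sequence model
    z_j = theta_j + eps_j / sqrt(n): keep the first k coefficients,
    where k minimizes the Cp-type criterion sum_{j>k} z_j^2 + 2k/n."""
    m = len(z)
    total = np.sum(z**2)
    # tail_rss[k] = sum_{j > k} z_j^2 for k = 0, ..., m
    tail_rss = np.concatenate(([total], total - np.cumsum(z**2)))
    crit = tail_rss + 2.0 * np.arange(m + 1) / n
    k = int(np.argmin(crit))
    theta_hat = np.where(np.arange(m) < k, z, 0.0)
    return k, theta_hat

# noiseless illustration with theta_j = j^{-2}: a coefficient is worth
# keeping exactly when theta_j^2 > 2/n, which for n = 1000 means j <= 4
theta = 1.0 / np.arange(1, 51) ** 2
k, theta_hat = aic_truncation(theta, n=1000)
```

The estimator theta_hat is the least squares estimate under the selected model, the select-then-estimate scheme whose minimax rate the thesis establishes.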